Log-Normal Matrix Completion for Large Scale Link Prediction

نویسندگان

  • Brian Mohtashemi
  • Thomas Ketseoglou
چکیده

The ubiquitous proliferation of online social networks has led to the widescale emergence of relational graphs expressing unique patterns in link formation and descriptive user node features. Matrix Factorization and Completion have become popular methods for Link Prediction due to the low rank nature of mutual node friendship information, and the availability of parallel computer architectures for rapid matrix processing. Current Link Prediction literature has demonstrated vast performance improvement through the utilization of sparsity in addition to the low rank matrix assumption. However, the majority of research has introduced sparsity through the limited L1 or Frobenius norms, instead of considering the more detailed distributions which led to the graph formation and relationship evolution. In particular, social networks have been found to express either Pareto, or more recently discovered, Log Normal distributions. Employing the convexity-inducing Lovasz Extension, we demonstrate how incorporating specific degree distribution information can lead to large scale improvements in Matrix Completion based Link prediction. We introduce LogNormal Matrix Completion (LNMC), and solve the complex optimization problem by employing Alternating Direction Method of Multipliers. Using data from three popular social networks, our experiments yield up to 5% AUC increase over top-performing non-structured sparsity based methods.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Link Prediction using Network Embedding based on Global Similarity

Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...

متن کامل

Large-Scale Nonparametric Estimation of Vehicle Travel Time Distributions

Fitting distributions of travel-time in vehicle traffic is an important application of spatio-temporal data mining. While regression methods to forecast the expected travel-time are standard approaches of travel-time prediction, we need to estimate distributions of the travel-time when using stateof-the-art risk-sensitive route recommendation systems. The authors introduce a novel nonparametric...

متن کامل

Incremental adaptive networks implemented by free space optical (FSO) communication

The aim of this paper is to fully analyze the effects of free space optical (FSO) communication links on the estimation performance of the adaptive incremental networks. The FSO links in this paper are described with two turbulence models namely the Log-normal and Gamma-Gamma distributions. In order to investigate the impact of these models we produced the link coefficients using these distribu...

متن کامل

Large-scale log-determinant computation through stochastic Chebyshev expansions

Logarithms of determinants of large positive definite matrices appear ubiquitously in machine learning applications including Gaussian graphical and Gaussian process models, partition functions of discrete graphical models, minimum-volume ellipsoids, metric learning and kernel learning. Log-determinant computation involves the Cholesky decomposition at the cost cubic in the number of variables,...

متن کامل

Evaluation and Application of the Gaussian-Log Gaussian Spatial Model for Robust Bayesian Prediction of Tehran Air Pollution Data

Air pollution is one of the major problems of Tehran metropolis. Regarding the fact that Tehran is surrounded by Alborz Mountains from three sides, the pollution due to the cars traffic and other polluting means causes the pollutants to be trapped in the city and have no exit without appropriate wind guff. Carbon monoxide (CO) is one of the most important sources of pollution in Tehran air. The...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1601.07714  شماره 

صفحات  -

تاریخ انتشار 2016